arXiv:1507.04592v1 [cs.IT] 16 Jul 2015




Energy-Efficient Hybrid Analog and Digital
Precoding for mmWave MIMO Systems with Large
Antenna Arrays

Xinyu Gao, Student Member, IEEE, Linglong Dai, Senior Member, IEEE, Shuangfeng Han, Member, IEEE, 
Chih-Lin I, Senior Member, IEEE, and Robert W. Heath Jr., Fellow, IEEE


Abstract —Millimeter wave (mmWave) MIMO will likely use
hybrid analog and digital precoding, which uses a small number
of RF chains to avoid the energy consumption associated with
mixed-signal components such as analog-to-digital converters, not to
mention the baseband processing complexity. However, most hybrid
precoding techniques consider a fully-connected architecture
requiring a large number of phase shifters, which is also
energy-intensive. In this paper, we focus on the more energy-efficient
hybrid precoding with sub-connected architecture, and propose a
successive interference cancelation (SIC)-based hybrid precoding
with near-optimal performance and low complexity. Inspired
by the idea of SIC for multi-user signal detection, we first
propose to decompose the total achievable rate optimization
problem with non-convex constraints into a series of simple
sub-rate optimization problems, each of which only considers
one sub-antenna array. Then, we prove that maximizing the
achievable sub-rate of each sub-antenna array is equivalent to
simply seeking a precoding vector sufficiently close (in terms
of Euclidean distance) to the unconstrained optimal solution.
Finally, we propose a low-complexity algorithm to realize
SIC-based hybrid precoding, which avoids the need for the singular
value decomposition (SVD) and matrix inversion. Complexity
evaluation shows that the complexity of SIC-based hybrid
precoding is only about 10% of that of the recently
proposed spatially sparse precoding in typical mmWave MIMO
systems. Simulation results verify the near-optimal performance
of SIC-based hybrid precoding.

Index Terms —MIMO, mmWave communications, hybrid
precoding, energy-efficient, 5G.

I. Introduction 

The integration of millimeter-wave (mmWave) and
multiple-input multiple-output (MIMO) can achieve orders
of magnitude increase in rates due to larger bandwidth and
greater spectral efficiency [1]. This makes mmWave MIMO
a promising technique for future 5G wireless communication
systems [2]. On one hand, the decreased wavelength associated
with the high frequencies of mmWave enables a large antenna
array to be packed in a small physical dimension. On the
other hand, the large antenna array can provide sufficient
antenna gain to compensate for the severe attenuation of
mmWave signals due to path loss, oxygen absorption, and
rainfall effect. Additionally, the large antenna array can also
support the transmission of multiple data streams to improve
the spectral efficiency through the use of precoding.

For MIMO in conventional cellular frequency bands (e.g.,
2-3 GHz), precoding is entirely realized in the digital domain
to cancel the interference between different data streams.
Digital precoding requires an expensive radio frequency (RF)
chain (including digital-to-analog converter, up converter, etc.)
for every antenna. In mmWave MIMO systems with a large
number of antennas, this leads to prohibitively high energy
consumption and hardware complexity. To solve this problem,
mmWave MIMO prefers the more energy-efficient hybrid
analog and digital precoding [3], which can significantly reduce
the number of required RF chains. Specifically, the transmitted
signals are first precoded by a digital precoder of small
dimension to guarantee the performance, and then precoded
again by an analog precoder of large dimension to save
energy and reduce the hardware complexity.

To realize the hybrid precoding in practice, two categories
of techniques have been proposed recently. The first category
is based on the spatially sparse precoding [6]-[8], which
formulates the achievable rate optimization problem as a sparse
approximation problem and solves it by orthogonal matching
pursuit (OMP) [9] to achieve near-optimal performance.
The second category, codebook-based hybrid precoding,
is proposed in [10]-[12]; it involves an iterative search
over a predefined codebook to find the optimal
hybrid precoding matrix. These algorithms are all designed for
hybrid precoding with fully-connected architecture, where
each RF chain is connected to all BS antennas via phase
shifters. As the number of BS antennas is very large (e.g.,
256 as considered in [6]), the fully-connected architecture has
two possible limitations. First, it requires thousands of phase
shifters, like a giant phased array radar, to realize the analog
precoding [3], which leads to both high energy consumption
and high hardware complexity. Second, each RF chain will drive
hundreds of BS antennas, which is also energy-intensive [13].
By contrast, hybrid precoding with sub-connected
architecture, where each RF chain is connected to only a subset
of BS antennas, can reduce the number of required phase
shifters without obvious performance loss. Therefore, the
sub-connected architecture is expected to be more
energy-efficient and easier to implement for mmWave MIMO
systems. Unfortunately, the design of hybrid precoding with
sub-connected architecture is still an open problem,
as the sub-connected architecture changes the constraints on
the original hybrid precoding problem.

In this paper, we propose a successive interference cancelation
(SIC)-based hybrid precoding with sub-connected
architecture. The contributions of this paper can be summarized as
follows.

1) Inspired by the idea of SIC derived for multi-user signal
detection, we propose to decompose the total achievable
rate optimization problem with non-convex constraints into a
series of simple sub-rate optimization problems, each of which
only considers one sub-antenna array. Then, we can maximize
the achievable sub-rate of each sub-antenna array one by one
until the last sub-antenna array is considered.

2) We prove that maximizing the achievable sub-rate of each 
sub-antenna array is equivalent to seeking a precoding vector 
which has the smallest Euclidean distance to the unconstrained 
optimal solution. Based on this fact, we can easily obtain the 
optimal precoding vector for each sub-antenna array. 

3) We further propose a low-complexity algorithm to realize
the SIC-based precoding, which avoids the need for singular
value decomposition (SVD) and matrix inversion. Complexity
evaluation shows that the complexity of SIC-based precoding
is only about 10% of that of the spatially sparse
precoding in typical mmWave MIMO systems, while it
can still achieve near-optimal performance as verified by
simulation results.

It is worth pointing out that to the best of the authors’ 
knowledge, our work in this paper is the first one that considers 
the hybrid precoding design with sub-connected architecture. 

The rest of the paper is organized as follows. Section II
briefly introduces the system model of mmWave MIMO.
Section III specifies the proposed SIC-based hybrid precoding,
together with the complexity evaluation. The simulation results
of the achievable rate are shown in Section IV. Finally,
conclusions are drawn in Section V.

Notation: Lower-case and upper-case boldface letters denote
vectors and matrices, respectively; $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^{-1}$, and
$|\cdot|$ denote the transpose, conjugate transpose, inversion, and
determinant of a matrix, respectively; $\|\cdot\|_1$ and $\|\cdot\|_2$ denote
the $l_1$- and $l_2$-norm of a vector, respectively; $\|\cdot\|_F$ denotes the
Frobenius norm of a matrix; $\mathrm{Re}\{\cdot\}$ and $\mathrm{Im}\{\cdot\}$ denote the real
part and imaginary part of a complex number, respectively;
$\mathbb{E}(\cdot)$ denotes the expectation; finally, $\mathbf{I}_N$ is the $N \times N$ identity
matrix.


II. System Model 

Fig. 1. Two typical architectures of the hybrid precoding in mmWave MIMO
systems: (a) Fully-connected architecture, where each RF chain is connected
to all BS antennas; (b) Sub-connected architecture, where each RF chain is
connected to only a subset of BS antennas.

Fig. 1 illustrates two typical architectures for hybrid
precoding in mmWave MIMO systems, i.e., the fully-connected
architecture as shown in Fig. 1 (a) and the sub-connected
architecture as shown in Fig. 1 (b). In both cases the BS has
$NM$ antennas but only $N$ RF chains. From Fig. 1, we observe
that the sub-connected architecture will likely be more
energy-efficient, since it only requires $NM$ phase shifters, while the
fully-connected architecture requires $N^2M$ phase shifters. To
fully achieve the spatial multiplexing gain, the BS usually
transmits $N$ independent data streams to users employing $K$
receive antennas [3].
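
As a quick illustration of the hardware saving described above, the following minimal sketch counts phase shifters for both architectures. The function name `phase_shifter_count` and the architecture labels are hypothetical names introduced for this example; the values $N = 8$, $M = 8$ are the configuration used later in the complexity evaluation.

```python
def phase_shifter_count(n_rf: int, m_per_chain: int, architecture: str) -> int:
    """Phase shifters needed by a BS with N RF chains and NM antennas."""
    if architecture == "sub-connected":
        # Each RF chain drives only its own M-antenna sub-array: N * M.
        return n_rf * m_per_chain
    if architecture == "fully-connected":
        # Each RF chain drives all NM antennas: N * (N * M) = N^2 * M.
        return n_rf * (n_rf * m_per_chain)
    raise ValueError(f"unknown architecture: {architecture}")


N, M = 8, 8
sub = phase_shifter_count(N, M, "sub-connected")
full = phase_shifter_count(N, M, "fully-connected")
print(sub, full)  # the fully-connected count is N times larger
```

The ratio between the two counts is exactly $N$, so the saving grows with the number of RF chains.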

In the sub-connected architecture as shown in Fig. 1
(b), $N$ data streams in the baseband are precoded by
the digital precoder $\mathbf{D}$. In cases where complexity is a
concern, $\mathbf{D}$ can be further specialized to be a diagonal
matrix as $\mathbf{D} = \mathrm{diag}[d_1, d_2, \cdots, d_N]$, where $d_n \in \mathbb{R}$ for
$n = 1, 2, \cdots, N$ [3]. Then $\mathbf{D}$ essentially performs
some power allocation. After passing through the corresponding
RF chain, the digital-domain signal from each RF chain is
delivered to only $M$ phase shifters to perform the analog
precoding, which can be denoted by the analog weighting
vector $\bar{\mathbf{a}}_n \in \mathbb{C}^{M \times 1}$, whose elements have the same amplitude
$1/\sqrt{M}$ but different phases. After the analog precoding,
each data stream is finally transmitted by a sub-antenna array
with only $M$ antennas associated with the corresponding RF
chain. Then, the received signal vector $\mathbf{y} = [y_1, y_2, \cdots, y_K]^T$
at the user in a narrowband system¹ can be presented as

$$\mathbf{y} = \sqrt{\rho}\,\mathbf{H}\mathbf{A}\mathbf{D}\mathbf{s} + \mathbf{n} = \sqrt{\rho}\,\mathbf{H}\mathbf{P}\mathbf{s} + \mathbf{n}, \tag{1}$$

where $\rho$ is the average received power; $\mathbf{H} \in \mathbb{C}^{K \times NM}$ denotes
the channel matrix; $\mathbf{A}$ is the $NM \times N$ analog precoding
matrix comprising $N$ analog weighting vectors,

$$\mathbf{A} = \begin{bmatrix} \bar{\mathbf{a}}_1 & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \bar{\mathbf{a}}_2 & & \mathbf{0} \\ \vdots & & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \bar{\mathbf{a}}_N \end{bmatrix}_{NM \times N}; \tag{2}$$

$\mathbf{s} = [s_1, s_2, \cdots, s_N]^T$ represents the transmitted signal vector
in the baseband, and usually $\mathbb{E}(\mathbf{s}\mathbf{s}^H) = \frac{1}{N}\mathbf{I}_N$ is assumed
for the normalized signal power. $\mathbf{P} = \mathbf{A}\mathbf{D}$ is the
hybrid precoding matrix of size $NM \times N$, which satisfies
$\|\mathbf{P}\|_F \le N$ to meet the total transmit power constraint.
Finally, $\mathbf{n} = [n_1, n_2, \cdots, n_K]^T$ is an additive white Gaussian
noise (AWGN) vector, whose entries are independent
and identically distributed (i.i.d.) as $\mathcal{CN}(0, \sigma^2)$.

¹While mmWave systems are expected to be broadband as in prior work [3],
the narrowband system can be regarded as a reasonable first step. The
extension to broadband systems is an interesting topic of future work.
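
To make the block-diagonal structure of the hybrid precoder concrete, the sketch below assembles $\mathbf{P} = \mathbf{A}\mathbf{D}$ from per-chain phase values and real digital weights, and evaluates its Frobenius norm. The helper names `sub_connected_precoder` and `frobenius_norm`, as well as the numerical phases and weights, are illustrative assumptions and not values from the paper.

```python
import cmath
import math


def sub_connected_precoder(phases, d):
    """Return P = A D as an NM x N list of lists.

    phases[n][m] is the phase of the m-th shifter on RF chain n (assumed input),
    d[n] is the real digital weight of stream n.
    """
    n_rf, m = len(phases), len(phases[0])
    P = [[0j] * n_rf for _ in range(n_rf * m)]
    for n in range(n_rf):
        for k in range(m):
            a = cmath.exp(1j * phases[n][k]) / math.sqrt(m)  # amplitude 1/sqrt(M)
            P[n * m + k][n] = d[n] * a  # non-zero only inside block n
    return P


def frobenius_norm(P):
    return math.sqrt(sum(abs(x) ** 2 for row in P for x in row))


phases = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]  # N = 2 chains, M = 3 antennas each
d = [1.0, 0.5]
P = sub_connected_precoder(phases, d)
norm = frobenius_norm(P)  # equals sqrt(d_1^2 + d_2^2) because each a_n has unit norm
```

Because every analog weighting vector has unit norm, the Frobenius norm of $\mathbf{P}$ depends only on the digital weights $d_n$.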

It is known that the mmWave channel $\mathbf{H}$ will not likely follow
the rich-scattering model assumed at low frequencies due to
the limited number of scatterers in the mmWave propagation
environment. In this paper, we adopt the geometric
Saleh-Valenzuela channel model for mmWave communications,
which was also used in related work, as

$$\mathbf{H} = \gamma \sum_{l=1}^{L} \alpha_l \Lambda_r(\phi_l^r, \theta_l^r)\, \Lambda_t(\phi_l^t, \theta_l^t)\, \mathbf{f}_r(\phi_l^r, \theta_l^r)\, \mathbf{f}_t^H(\phi_l^t, \theta_l^t), \tag{3}$$

where $\gamma = \sqrt{NMK/L}$ is a normalization factor, $L$ is the number
of effective channel paths corresponding to the limited number
of scatterers, and we usually have $L \le N$ for mmWave
communication systems. $\alpha_l \in \mathbb{C}$ is the gain of the $l$th path. $\phi_l^t$ ($\theta_l^t$)
and $\phi_l^r$ ($\theta_l^r$) are the azimuth (elevation) angles of departure and
arrival (AoDs/AoAs), respectively. $\Lambda_t(\phi_l^t, \theta_l^t)$ and $\Lambda_r(\phi_l^r, \theta_l^r)$
denote the transmit and receive antenna array gain at a specific
AoD and AoA, respectively. For simplicity but without loss
of generality, $\Lambda_t(\phi_l^t, \theta_l^t)$ and $\Lambda_r(\phi_l^r, \theta_l^r)$ can be set to one
within the range of AoDs/AoAs. Finally, $\mathbf{f}_t(\phi_l^t, \theta_l^t)$ and
$\mathbf{f}_r(\phi_l^r, \theta_l^r)$ are the antenna array response vectors depending
on the antenna array structures at the BS and the user,
respectively. For the uniform linear array (ULA) with $U$ elements,
the array response vector can be presented as

$$\mathbf{f}_{\mathrm{ULA}}(\phi) = \frac{1}{\sqrt{U}} \left[ 1,\ e^{j\frac{2\pi}{\lambda} d \sin(\phi)},\ \cdots,\ e^{j(U-1)\frac{2\pi}{\lambda} d \sin(\phi)} \right]^T, \tag{4}$$

where $\lambda$ denotes the wavelength of the signal, and $d$ is the
antenna spacing. Note that here we drop the subscripts
$\{t, r\}$ in (4), and we also do not include $\theta$ in (4) since the
ULA response vector is independent of the elevation angle.
Additionally, when we consider the uniform planar array
(UPA) with $W_1$ and $W_2$ elements ($W_1 W_2 = U$) in the horizontal
and vertical directions, respectively, the array response vector can be
given by

$$\mathbf{f}_{\mathrm{UPA}}(\phi, \theta) = \frac{1}{\sqrt{U}} \left[ 1,\ \cdots,\ e^{j\frac{2\pi}{\lambda} d \left( x \sin(\phi)\sin(\theta) + y \cos(\theta) \right)},\ \cdots,\ e^{j\frac{2\pi}{\lambda} d \left( (W_1 - 1) \sin(\phi)\sin(\theta) + (W_2 - 1)\cos(\theta) \right)} \right]^T, \tag{5}$$

where $0 \le x \le (W_1 - 1)$ and $0 \le y \le (W_2 - 1)$.
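
The array response vectors (4) and (5) can be sketched in a few lines of plain Python. The function names, the `d_over_lambda` parameter (antenna spacing in wavelengths, defaulting to half-wavelength spacing), and the ordering of UPA elements (index $x$ outer, $y$ inner) are assumptions made for this illustration; the paper does not fix the element ordering.

```python
import cmath
import math


def ula_response(phi, U, d_over_lambda=0.5):
    """ULA response (4): (1/sqrt(U)) * [exp(j*u*(2*pi/lambda)*d*sin(phi))], u = 0..U-1."""
    k = 2 * math.pi * d_over_lambda
    return [cmath.exp(1j * k * u * math.sin(phi)) / math.sqrt(U) for u in range(U)]


def upa_response(phi, theta, W1, W2, d_over_lambda=0.5):
    """UPA response (5) with elements ordered x-major (an assumed ordering)."""
    k = 2 * math.pi * d_over_lambda
    U = W1 * W2
    return [
        cmath.exp(1j * k * (x * math.sin(phi) * math.sin(theta) + y * math.cos(theta)))
        / math.sqrt(U)
        for x in range(W1)
        for y in range(W2)
    ]


f_ula = ula_response(0.3, 4)
# A UPA with a single row (W2 = 1) at elevation pi/2 reduces to the ULA case.
f_upa = upa_response(0.3, math.pi / 2, 4, 1)
```

Both vectors have unit $l_2$-norm because each of the $U$ entries has amplitude $1/\sqrt{U}$.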


III. SIC-Based Hybrid Precoding for mmWave
MIMO Systems

In this section, we propose a low-complexity SIC-based 
hybrid precoding to achieve the near-optimal performance. The 
evaluation of computational complexity is also provided to 
show its advantages over current solutions. 


A. Structure of SIC-based hybrid precoding 

The final aim of precoding is to maximize the total achievable
rate $R$ of mmWave MIMO systems, which can be
expressed as

$$R = \log_2\left( \left| \mathbf{I}_K + \frac{\rho}{N\sigma^2} \mathbf{H}\mathbf{P}\mathbf{P}^H\mathbf{H}^H \right| \right). \tag{6}$$


According to the system model (1) in Section II,
since the hybrid precoding matrix $\mathbf{P}$ can be represented
as $\mathbf{P} = \mathbf{A}\mathbf{D} = \mathrm{diag}\{\bar{\mathbf{a}}_1, \cdots, \bar{\mathbf{a}}_N\} \cdot \mathrm{diag}\{d_1, \cdots, d_N\}$, there
are three constraints for the design of $\mathbf{P}$.

Constraint 1: $\mathbf{P}$ should be a block diagonal matrix similar to
the form of $\mathbf{A}$ as shown in (2), i.e., $\mathbf{P} = \mathrm{diag}\{\bar{\mathbf{p}}_1, \cdots, \bar{\mathbf{p}}_N\}$,
where $\bar{\mathbf{p}}_n = d_n \bar{\mathbf{a}}_n$ is the $M \times 1$ non-zero vector of the $n$th
column $\mathbf{p}_n$ of $\mathbf{P}$, i.e., $\mathbf{p}_n = [\mathbf{0}_{1 \times M(n-1)}, \bar{\mathbf{p}}_n^T, \mathbf{0}_{1 \times M(N-n)}]^T$;

Constraint 2: The non-zero elements of each column of $\mathbf{P}$
should have the same amplitude, since the digital precoding
matrix $\mathbf{D}$ is a diagonal matrix, and the amplitude of non-zero
elements of the analog precoding matrix $\mathbf{A}$ is fixed to $1/\sqrt{M}$;

Constraint 3: The Frobenius norm of $\mathbf{P}$ should satisfy
$\|\mathbf{P}\|_F \le N$ to meet the total transmit power constraint.

Unfortunately, these non-convex constraints on $\mathbf{P}$ make
maximizing the total achievable rate (6) very difficult.
However, based on the special block diagonal structure
of the hybrid precoding matrix $\mathbf{P}$, we can observe that the
precoding on different sub-antenna arrays is independent.
This inspires us to decompose the total achievable rate (6)
into a series of sub-rate optimization problems, each of which
only considers one sub-antenna array.

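In particular, we can divide the hybrid precoding matrix
$\mathbf{P}$ as $\mathbf{P} = [\mathbf{P}_{N-1}\ \mathbf{p}_N]$, where $\mathbf{p}_N$ is the $N$th column of $\mathbf{P}$,
and $\mathbf{P}_{N-1}$ is an $NM \times (N-1)$ matrix containing the first
$(N-1)$ columns of $\mathbf{P}$. Then, the total achievable rate $R$ in (6)
can be rewritten as

$$
\begin{aligned}
R &= \log_2\left( \left| \mathbf{I}_K + \frac{\rho}{N\sigma^2} \mathbf{H}\mathbf{P}\mathbf{P}^H\mathbf{H}^H \right| \right) \\
&= \log_2\left( \left| \mathbf{I}_K + \frac{\rho}{N\sigma^2} \mathbf{H} [\mathbf{P}_{N-1}\ \mathbf{p}_N] [\mathbf{P}_{N-1}\ \mathbf{p}_N]^H \mathbf{H}^H \right| \right) \\
&= \log_2\left( \left| \mathbf{I}_K + \frac{\rho}{N\sigma^2} \mathbf{H}\mathbf{P}_{N-1}\mathbf{P}_{N-1}^H\mathbf{H}^H + \frac{\rho}{N\sigma^2} \mathbf{H}\mathbf{p}_N\mathbf{p}_N^H\mathbf{H}^H \right| \right) \\
&\overset{(a)}{=} \log_2\left( |\mathbf{T}_{N-1}| \right) + \log_2\left( \left| \mathbf{I}_K + \frac{\rho}{N\sigma^2} \mathbf{T}_{N-1}^{-1} \mathbf{H}\mathbf{p}_N\mathbf{p}_N^H\mathbf{H}^H \right| \right) \\
&\overset{(b)}{=} \log_2\left( |\mathbf{T}_{N-1}| \right) + \log_2\left( 1 + \frac{\rho}{N\sigma^2} \mathbf{p}_N^H\mathbf{H}^H \mathbf{T}_{N-1}^{-1} \mathbf{H}\mathbf{p}_N \right),
\end{aligned} \tag{7}
$$

where (a) is obtained by defining the auxiliary matrix
$\mathbf{T}_{N-1} = \mathbf{I}_K + \frac{\rho}{N\sigma^2} \mathbf{H}\mathbf{P}_{N-1}\mathbf{P}_{N-1}^H\mathbf{H}^H$, and (b) is true
due to the fact that $|\mathbf{I} + \mathbf{X}\mathbf{Y}| = |\mathbf{I} + \mathbf{Y}\mathbf{X}|$ by defining
$\mathbf{X} = \mathbf{T}_{N-1}^{-1}\mathbf{H}\mathbf{p}_N$ and $\mathbf{Y} = \mathbf{p}_N^H\mathbf{H}^H$. Note that the second term
$\log_2\left( 1 + \frac{\rho}{N\sigma^2} \mathbf{p}_N^H\mathbf{H}^H \mathbf{T}_{N-1}^{-1} \mathbf{H}\mathbf{p}_N \right)$ on the right side of (7) is
the achievable sub-rate of the $N$th sub-antenna array, while
the first term $\log_2(|\mathbf{T}_{N-1}|)$ shares the same form as (6).
This observation implies that we can further decompose
$\log_2(|\mathbf{T}_{N-1}|)$ using the same method as in (7) as
$\log_2(|\mathbf{T}_{N-2}|) + \log_2\left( 1 + \frac{\rho}{N\sigma^2} \mathbf{p}_{N-1}^H\mathbf{H}^H \mathbf{T}_{N-2}^{-1} \mathbf{H}\mathbf{p}_{N-1} \right)$.

Then, after $N$ such decompositions, the total achievable rate
$R$ in (6) can be presented as

$$R = \sum_{n=1}^{N} \log_2\left( 1 + \frac{\rho}{N\sigma^2} \mathbf{p}_n^H\mathbf{H}^H \mathbf{T}_{n-1}^{-1} \mathbf{H}\mathbf{p}_n \right), \tag{8}$$

where we have $\mathbf{T}_n = \mathbf{I}_K + \frac{\rho}{N\sigma^2} \mathbf{H}\mathbf{P}_n\mathbf{P}_n^H\mathbf{H}^H$ and $\mathbf{T}_0 = \mathbf{I}_K$.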

From (8), we observe that the total achievable rate optimization
problem can be transformed into a series of sub-rate optimization
problems of sub-antenna arrays, which can be optimized
one by one. After that, inspired by the idea of SIC for
multi-user signal detection, we can optimize the achievable
sub-rate of the first sub-antenna array and update the matrix
$\mathbf{T}_1$. Then, the same method can be utilized to optimize
the achievable sub-rate of the second sub-antenna array. This
procedure is executed until the last sub-antenna array
is considered. Fig. 2 shows the diagram of the proposed
SIC-based hybrid precoding. Next, we will discuss how to optimize
the achievable sub-rate of each sub-antenna array.
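
The rate decomposition in (6)-(8) can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not the authors' code: it fixes $K = 2$ so that determinants and inverses have closed forms, uses arbitrary made-up entries for $\mathbf{H}$ and the precoding columns, and writes `c` for the constant $\rho/(N\sigma^2)$. All helper names are hypothetical.

```python
import math


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def herm(A):
    return [[x.conjugate() for x in col] for col in zip(*A)]


def det2(T):
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]


def inv2(T):
    d = det2(T)
    return [[T[1][1] / d, -T[0][1] / d], [-T[1][0] / d, T[0][0] / d]]


def t_matrix(H, cols, c):
    """T = I_2 + c * sum_n (H p_n)(H p_n)^H, i.e. I + c H P P^H H^H for P = [p_1 ... p_n]."""
    T = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for p in cols:
        hp = matmul(H, p)
        outer = matmul(hp, herm(hp))
        for i in range(2):
            for j in range(2):
                T[i][j] += c * outer[i][j]
    return T


def total_rate(H, cols, c):
    # Left-hand side: log2 |I + c H P P^H H^H| as in (6).
    return math.log2(det2(t_matrix(H, cols, c)).real)


def sub_rates(H, cols, c):
    # Right-hand side terms of (8), one per sub-antenna array.
    rates = []
    for n, p in enumerate(cols):
        t_prev_inv = inv2(t_matrix(H, cols[:n], c))
        hp = matmul(H, p)
        q = matmul(herm(hp), matmul(t_prev_inv, hp))[0][0]
        rates.append(math.log2(1 + c * q.real))
    return rates


# Toy setup: K = 2, N = 2, M = 2 (illustrative values only).
H = [[1 + 1j, 0.5 - 0.2j, -0.3 + 0.7j, 0.2j],
     [0.4 - 0.6j, -1 + 0.1j, 0.8 + 0j, 0.3 + 0.3j]]
p1 = [[0.5 + 0.5j], [0.5 - 0.5j], [0j], [0j]]   # non-zero only in block 1
p2 = [[0j], [0j], [0.3j], [-0.3 + 0j]]          # non-zero only in block 2
c = 1.0
R_total = total_rate(H, [p1, p2], c)
R_sum = sum(sub_rates(H, [p1, p2], c))
```

The two quantities agree up to floating-point error, which is the property that allows the sub-antenna arrays to be optimized one after another.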



Fig. 2. Diagram of the proposed SIC-based hybrid precoding. 

B. Solution to the sub-rate optimization problem 

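In this subsection, we focus on the sub-rate optimization
problem of the $n$th sub-antenna array, which can be directly
applied to other sub-antenna arrays. According to (8), the
sub-rate optimization problem of the $n$th sub-antenna array by
designing the $n$th precoding vector $\mathbf{p}_n$ can be stated as

$$\mathbf{p}_n^{\mathrm{opt}} = \arg\max_{\mathbf{p}_n \in \mathcal{F}} \log_2\left( 1 + \frac{\rho}{N\sigma^2} \mathbf{p}_n^H \mathbf{G}_{n-1} \mathbf{p}_n \right), \tag{9}$$

where $\mathbf{G}_{n-1}$ is defined as $\mathbf{G}_{n-1} = \mathbf{H}^H\mathbf{T}_{n-1}^{-1}\mathbf{H}$, and $\mathcal{F}$ is the set
of all feasible vectors satisfying the three constraints described
in Section III-A. Note that the $n$th precoding vector $\mathbf{p}_n$
only has $M$ non-zero elements, from the $(M(n-1)+1)$th
one to the $(Mn)$th one. Therefore, the sub-rate optimization
problem (9) can be equivalently written as

$$\bar{\mathbf{p}}_n^{\mathrm{opt}} = \arg\max_{\bar{\mathbf{p}}_n \in \bar{\mathcal{F}}} \log_2\left( 1 + \frac{\rho}{N\sigma^2} \bar{\mathbf{p}}_n^H \bar{\mathbf{G}}_{n-1} \bar{\mathbf{p}}_n \right), \tag{10}$$

where $\bar{\mathcal{F}}$ includes all possible $M \times 1$ vectors satisfying
Constraint 2 and Constraint 3, and $\bar{\mathbf{G}}_{n-1}$ of size $M \times M$ is the
corresponding sub-matrix of $\mathbf{G}_{n-1}$ obtained by keeping only the rows
and columns of $\mathbf{G}_{n-1}$ from the $(M(n-1)+1)$th one to the
$(Mn)$th one, which can be presented as

$$\bar{\mathbf{G}}_{n-1} = \mathbf{R}\mathbf{G}_{n-1}\mathbf{R}^H = \mathbf{R}\mathbf{H}^H\mathbf{T}_{n-1}^{-1}\mathbf{H}\mathbf{R}^H, \tag{11}$$

where $\mathbf{R} = [\,\mathbf{0}_{M \times M(n-1)}\ \ \mathbf{I}_M\ \ \mathbf{0}_{M \times M(N-n)}\,]$ is the
corresponding selection matrix.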

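Define the singular value decomposition (SVD) of the
Hermitian matrix $\bar{\mathbf{G}}_{n-1}$ as $\bar{\mathbf{G}}_{n-1} = \mathbf{V}\boldsymbol{\Sigma}\mathbf{V}^H$, where $\boldsymbol{\Sigma}$ is an
$M \times M$ diagonal matrix containing the singular values of
$\bar{\mathbf{G}}_{n-1}$ in decreasing order, and $\mathbf{V}$ is an $M \times M$ unitary
matrix. It is known that the optimal unconstrained precoding
vector of (10) is the first column $\mathbf{v}_1$ of $\mathbf{V}$, i.e., the first right
singular vector of $\bar{\mathbf{G}}_{n-1}$. According to the constraints
mentioned in Section III-A, we cannot directly choose $\bar{\mathbf{p}}_n^{\mathrm{opt}}$
as $\mathbf{v}_1$, since the elements of $\mathbf{v}_1$ do not obey the constraint of
same amplitude (i.e., Constraint 2). To find a practical solution
to the sub-rate optimization problem (10), we need to further
convert (10) into another form, which is given by the following
Proposition 1.

Proposition 1. The optimization problem (10)

$$\bar{\mathbf{p}}_n^{\mathrm{opt}} = \arg\max_{\bar{\mathbf{p}}_n \in \bar{\mathcal{F}}} \log_2\left( 1 + \frac{\rho}{N\sigma^2} \bar{\mathbf{p}}_n^H \bar{\mathbf{G}}_{n-1} \bar{\mathbf{p}}_n \right)$$

is equivalent to the following problem

$$\bar{\mathbf{p}}_n^{\mathrm{opt}} = \arg\min_{\bar{\mathbf{p}}_n \in \bar{\mathcal{F}}} \|\mathbf{v}_1 - \bar{\mathbf{p}}_n\|_2^2, \tag{12}$$

where $\mathbf{v}_1$ is the first right singular vector of $\bar{\mathbf{G}}_{n-1}$.

Proof: See Appendix A. ■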

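Proposition 1 indicates that we can find a feasible
precoding vector $\bar{\mathbf{p}}_n$, which is sufficiently close (in terms of
Euclidean distance) to the optimal but impractical precoding
vector $\mathbf{v}_1$, to maximize the achievable sub-rate of the $n$th
sub-antenna array. Since $\bar{\mathbf{p}}_n = d_n\bar{\mathbf{a}}_n$, the target
$\|\mathbf{v}_1 - \bar{\mathbf{p}}_n\|_2^2$ in (12) can be rewritten as

$$
\begin{aligned}
\|\mathbf{v}_1 - \bar{\mathbf{p}}_n\|_2^2
&= (\mathbf{v}_1 - d_n\bar{\mathbf{a}}_n)^H (\mathbf{v}_1 - d_n\bar{\mathbf{a}}_n) \\
&= \mathbf{v}_1^H\mathbf{v}_1 + d_n^2\,\bar{\mathbf{a}}_n^H\bar{\mathbf{a}}_n - 2 d_n \mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right) \\
&\overset{(a)}{=} 1 + d_n^2 - 2 d_n \mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right) \\
&= \left( d_n - \mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right) \right)^2 + \left( 1 - \left[\mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right)\right]^2 \right),
\end{aligned} \tag{13}
$$

where (a) is obtained based on the facts that $\mathbf{v}_1^H\mathbf{v}_1 = 1$ and
$\bar{\mathbf{a}}_n^H\bar{\mathbf{a}}_n = 1$, since $\mathbf{v}_1$ is the first column of the unitary matrix
$\mathbf{V}$ and each element of $\bar{\mathbf{a}}_n$ has the same amplitude $1/\sqrt{M}$.

From (13), we observe that the distance between
$\bar{\mathbf{p}}_n$ and $\mathbf{v}_1$ consists of two parts. The first one is
$\left( d_n - \mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right) \right)^2$, which can be minimized to zero
by choosing $d_n = \mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right)$. The second one is
$\left( 1 - \left[\mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right)\right]^2 \right)$, which can be minimized by
maximizing $\left|\mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right)\right|$. Note that both $\bar{\mathbf{a}}_n$ and $\mathbf{v}_1$
have a fixed power of one, i.e., $\mathbf{v}_1^H\mathbf{v}_1 = 1$ and $\bar{\mathbf{a}}_n^H\bar{\mathbf{a}}_n = 1$.
Therefore, the optimal $\bar{\mathbf{a}}_n^{\mathrm{opt}}$ to maximize $\left|\mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n\right)\right|$ is

$$\bar{\mathbf{a}}_n^{\mathrm{opt}} = \frac{1}{\sqrt{M}} e^{j\,\mathrm{angle}(\mathbf{v}_1)}, \tag{14}$$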

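where $\mathrm{angle}(\mathbf{v}_1)$ denotes the phase vector of $\mathbf{v}_1$, i.e., each
element of $\bar{\mathbf{a}}_n^{\mathrm{opt}}$ shares the same phase as the corresponding
element of $\mathbf{v}_1$. Accordingly, the optimal choice of $d_n$ is

$$d_n^{\mathrm{opt}} = \mathrm{Re}\left(\mathbf{v}_1^H\bar{\mathbf{a}}_n^{\mathrm{opt}}\right) = \frac{1}{\sqrt{M}} \mathrm{Re}\left(\mathbf{v}_1^H e^{j\,\mathrm{angle}(\mathbf{v}_1)}\right) = \frac{\|\mathbf{v}_1\|_1}{\sqrt{M}}. \tag{15}$$

Based on (14) and (15), the optimal solution $\bar{\mathbf{p}}_n^{\mathrm{opt}}$ to the
optimization problem (12) (or equivalently (10)) can be obtained
by

$$\bar{\mathbf{p}}_n^{\mathrm{opt}} = d_n^{\mathrm{opt}}\bar{\mathbf{a}}_n^{\mathrm{opt}} = \frac{1}{M} \|\mathbf{v}_1\|_1 e^{j\,\mathrm{angle}(\mathbf{v}_1)}. \tag{16}$$

It is worth pointing out that, since $\mathbf{v}_1$ is the first column of the
unitary matrix $\mathbf{V}$, we have $\|\mathbf{v}_1\|_2 = 1$, so each element $v_i$ of $\mathbf{v}_1$
(for $i = 1, \cdots, M$) has an amplitude no larger than one and
$\|\mathbf{v}_1\|_1 \le \sqrt{M}\,\|\mathbf{v}_1\|_2 = \sqrt{M}$. Therefore, we have
$\|\bar{\mathbf{p}}_n^{\mathrm{opt}}\|_2^2 = \|\mathbf{v}_1\|_1^2 / M \le 1$.
Note that for all sub-antenna arrays, the optimal solutions
$\bar{\mathbf{p}}_n^{\mathrm{opt}}$ for $n = 1, 2, \cdots, N$ have a similar form. Thus, we can
conclude that

$$\|\mathbf{P}^{\mathrm{opt}}\|_F = \left\| \mathrm{diag}\left\{ \bar{\mathbf{p}}_1^{\mathrm{opt}}, \cdots, \bar{\mathbf{p}}_N^{\mathrm{opt}} \right\} \right\|_F \le N, \tag{17}$$

which demonstrates that the total transmit power constraint
(Constraint 3) is satisfied.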
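
The closed-form solution in (14)-(16) amounts to keeping the phases of $\mathbf{v}_1$ and equalizing the amplitudes. A minimal sketch follows, assuming $\mathbf{v}_1$ is already available as a unit-norm list of complex numbers; the function names and the example vector are hypothetical.

```python
import cmath
import math


def closest_feasible_vector(v1):
    """Return (a_opt, d_opt, p_opt) following (14), (15) and (16)."""
    M = len(v1)
    l1_norm = sum(abs(x) for x in v1)
    # (14): same phases as v1, constant amplitude 1/sqrt(M).
    a_opt = [cmath.exp(1j * cmath.phase(x)) / math.sqrt(M) for x in v1]
    # (15): d_opt = Re(v1^H a_opt) = ||v1||_1 / sqrt(M).
    d_opt = l1_norm / math.sqrt(M)
    # (16): p_opt = d_opt * a_opt = (1/M) ||v1||_1 exp(j angle(v1)).
    p_opt = [d_opt * a for a in a_opt]
    return a_opt, d_opt, p_opt


def squared_distance(u, v):
    return sum(abs(x - y) ** 2 for x, y in zip(u, v))


v1 = [0.8, 0.6j]  # unit-norm example vector with unequal amplitudes
a_opt, d_opt, p_opt = closest_feasible_vector(v1)
dist = squared_distance(v1, p_opt)
power = sum(abs(x) ** 2 for x in p_opt)
```

With $d_n = d_n^{\mathrm{opt}}$ the first term of (13) vanishes, so the remaining distance is $1 - \|\mathbf{v}_1\|_1^2 / M$, and the power of $\bar{\mathbf{p}}_n^{\mathrm{opt}}$ is $\|\mathbf{v}_1\|_1^2 / M \le 1$.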

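After we have acquired $\bar{\mathbf{p}}_n^{\mathrm{opt}}$ for the $n$th sub-antenna
array, the matrices $\mathbf{T}_n = \mathbf{I}_K + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{P}_n\mathbf{P}_n^H\mathbf{H}^H$ (8) and
$\bar{\mathbf{G}}_n = \mathbf{R}\mathbf{H}^H\mathbf{T}_n^{-1}\mathbf{H}\mathbf{R}^H$ (11) can be updated. Then, the
method described above for the $n$th sub-antenna array can
be reused to optimize the achievable sub-rate of the
$(n+1)$th sub-antenna array. To sum up, solving the sub-rate
optimization problem of the $n$th sub-antenna array consists of
the following three steps.

Step 1: Execute the SVD of $\bar{\mathbf{G}}_{n-1}$ to obtain $\mathbf{v}_1$;

Step 2: Let $\bar{\mathbf{p}}_n^{\mathrm{opt}} = \frac{1}{M}\|\mathbf{v}_1\|_1 e^{j\,\mathrm{angle}(\mathbf{v}_1)}$ as the optimal solution
to the current $n$th sub-antenna array;

Step 3: Update matrices $\mathbf{T}_n = \mathbf{I}_K + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{P}_n\mathbf{P}_n^H\mathbf{H}^H$ and
$\bar{\mathbf{G}}_n = \mathbf{R}\mathbf{H}^H\mathbf{T}_n^{-1}\mathbf{H}\mathbf{R}^H$ for the next $(n+1)$th sub-antenna
array.

Note that although we can obtain the optimal solution $\bar{\mathbf{p}}_n^{\mathrm{opt}}$
by the method above, we need to compute the SVD of $\bar{\mathbf{G}}_{n-1}$
(Step 1) and the matrix $\bar{\mathbf{G}}_n$ (Step 3), the latter involving a
large matrix inversion, which leads to high computational
complexity as well as high hardware complexity. To address this,
we next propose a low-complexity algorithm to obtain
$\bar{\mathbf{p}}_n^{\mathrm{opt}}$ that avoids the complicated SVD and matrix inversion.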


C. Low-complexity algorithm to obtain the optimal solution 

We start by considering how to avoid the SVD, which involves
high computational complexity as well as a large number of
divisions that are difficult to implement in hardware.
We can observe from Step 1 that the SVD of $\bar{\mathbf{G}}_{n-1}$ does not
need to be computed to acquire $\boldsymbol{\Sigma}$ and $\mathbf{V}$, as only the first
column $\mathbf{v}_1$ of $\mathbf{V}$ is enough to obtain $\bar{\mathbf{p}}_n^{\mathrm{opt}}$. This observation
inspires us to exploit the simple power iteration algorithm,
which is used to compute the largest eigenvalue and the
corresponding eigenvector of a diagonalizable matrix. Since
$\bar{\mathbf{G}}_{n-1}$ is a Hermitian matrix, it follows that: 1) $\bar{\mathbf{G}}_{n-1}$ is also
a diagonalizable matrix; 2) the singular values (right singular
vectors) of $\bar{\mathbf{G}}_{n-1}$ are the same as the eigenvalues (eigenvectors).
Therefore, the power iteration algorithm can also be utilized to
compute $\mathbf{v}_1$ as well as the largest singular value $\Sigma_1$ of $\bar{\mathbf{G}}_{n-1}$
with low complexity.

Input: (1) $\bar{\mathbf{G}}_{n-1}$;
       (2) Initial solution $\mathbf{u}^{(0)}$;
       (3) Maximum number of iterations $S$
for $1 \le s \le S$
  1) $\mathbf{z}^{(s)} = \bar{\mathbf{G}}_{n-1}\mathbf{u}^{(s-1)}$
  2) $m^{(s)} = \arg\max_{z_i^{(s)}} \left| z_i^{(s)} \right|$
  3) if $1 \le s \le 2$
       $n^{(s)} = m^{(s)}$
     else
       $n^{(s)} = \dfrac{m^{(s)} m^{(s-2)} - \left(m^{(s-1)}\right)^2}{m^{(s)} - 2m^{(s-1)} + m^{(s-2)}}$
     end if
  4) $\mathbf{u}^{(s)} = \mathbf{z}^{(s)} / n^{(s)}$
end for
Output: (1) The largest singular value $\Sigma_1 = n^{(S)}$
        (2) The first singular vector $\mathbf{v}_1 = \mathbf{u}^{(S)} / \|\mathbf{u}^{(S)}\|_2$

Algorithm 1: Power iteration algorithm

More specifically, as shown by the pseudo-code in
Algorithm 1, the power iteration algorithm starts with an initial
solution $\mathbf{u}^{(0)}$, which is usually set as $[1, 1, \cdots, 1]^T$
without loss of generality [19]. In each iteration, it first
computes the auxiliary vector $\mathbf{z}^{(s)} = \bar{\mathbf{G}}_{n-1}\mathbf{u}^{(s-1)}$ ($s$ is the number
of iterations) and then extracts the element of $\mathbf{z}^{(s)}$ having
the largest amplitude as $m^{(s)}$. After that, $\mathbf{u}^{(s)}$ is updated
as $\mathbf{u}^{(s)} = \mathbf{z}^{(s)} / m^{(s)}$ for the next iteration. The power iteration
algorithm stops when the number of iterations reaches the
predefined number $S$. Finally, $m^{(S)}$ and $\mathbf{u}^{(S)} / \|\mathbf{u}^{(S)}\|_2$ will
be output as the largest singular value $\Sigma_1$ and the first right
singular vector $\mathbf{v}_1$ of $\bar{\mathbf{G}}_{n-1}$, respectively.

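It is known that

$$m^{(s)} = \Sigma_1 \left[ 1 + \mathcal{O}\left( \left( \frac{\Sigma_2}{\Sigma_1} \right)^s \right) \right], \tag{18}$$

where $\Sigma_2$ is the second largest singular value of $\bar{\mathbf{G}}_{n-1}$.
From (18), we can conclude that $m^{(s)}$ converges to $\Sigma_1$
as long as $\Sigma_1 \ne \Sigma_2$. Similarly, when $\Sigma_1 \ne \Sigma_2$, $\mathbf{u}^{(s)} / \|\mathbf{u}^{(s)}\|_2$
will also converge to $\mathbf{v}_1$, i.e.,

$$\lim_{s \to \infty} m^{(s)} = \Sigma_1, \quad \lim_{s \to \infty} \frac{\mathbf{u}^{(s)}}{\|\mathbf{u}^{(s)}\|_2} = \mathbf{v}_1. \tag{19}$$

Although the power iteration algorithm is convergent, its
convergence rate may be slow if $\Sigma_1 \approx \Sigma_2$ based on (18).
To solve this problem, we propose to utilize the Aitken
acceleration method [20] to further increase the convergence
rate of the power iteration algorithm. Specifically, we can
compute

$$
n^{(s)} =
\begin{cases}
m^{(s)}, & \text{for } 1 \le s \le 2, \\[4pt]
\dfrac{m^{(s)} m^{(s-2)} - \left(m^{(s-1)}\right)^2}{m^{(s)} - 2m^{(s-1)} + m^{(s-2)}}, & \text{for } 2 < s \le S.
\end{cases} \tag{20}
$$

Then, $\mathbf{u}^{(s)}$ and $\Sigma_1$ will be correspondingly changed to
$\mathbf{u}^{(s)} = \mathbf{z}^{(s)} / n^{(s)}$ and $\Sigma_1 = n^{(S)}$, respectively.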
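
A plain-Python transcription of Algorithm 1 with the Aitken step (20) might look as follows. This is a sketch under stated assumptions rather than the authors' implementation: the function name `power_iteration_aitken` is hypothetical, the `eps` guard against a vanishing denominator in (20) is an addition made here for numerical safety, and the $2 \times 2$ test matrix is a made-up example with a large gap between its two eigenvalues.

```python
import math


def power_iteration_aitken(G, S=5, eps=1e-12):
    """Estimate the largest singular value and first singular vector of Hermitian G."""
    M = len(G)
    u = [1.0 + 0j] * M                      # initial solution u^(0) = [1, ..., 1]^T
    m_hist = []
    n_s = None
    for s in range(1, S + 1):
        # Step 1: z^(s) = G u^(s-1)
        z = [sum(G[i][k] * u[k] for k in range(M)) for i in range(M)]
        # Step 2: element of z^(s) with the largest amplitude
        m_s = max(z, key=abs)
        m_hist.append(m_s)
        # Step 3: Aitken acceleration (20) from the third iteration on
        n_s = m_s
        if s > 2:
            m0, m1, m2 = m_hist[-3], m_hist[-2], m_hist[-1]
            den = m2 - 2 * m1 + m0
            if abs(den) > eps:              # guard (assumption): otherwise keep n = m
                n_s = (m2 * m0 - m1 * m1) / den
        # Step 4: u^(s) = z^(s) / n^(s)
        u = [zi / n_s for zi in z]
    norm = math.sqrt(sum(abs(x) ** 2 for x in u))
    return n_s, [x / norm for x in u]


G = [[10, 1], [1, 1]]                       # real symmetric example (illustrative)
sigma1, v1 = power_iteration_aitken(G, S=5)
sigma1_exact = (11 + math.sqrt(85)) / 2     # closed-form largest eigenvalue of G
```

Only matrix-vector products and a handful of scalar operations are needed per iteration, which is what makes the approach attractive compared with a full SVD.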

Next, we will focus on how to reduce the complexity
of computing the matrices $\mathbf{T}_n = \mathbf{I}_K + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{P}_n\mathbf{P}_n^H\mathbf{H}^H$
and $\bar{\mathbf{G}}_n = \mathbf{R}\mathbf{H}^H\mathbf{T}_n^{-1}\mathbf{H}\mathbf{R}^H$, which involve complicated
matrix-to-matrix multiplication and large matrix
inversion. In particular, with some standard mathematical
manipulations, the computation of $\bar{\mathbf{G}}_n$ can be significantly simplified
as shown by the following Proposition 2.

Proposition 2. The matrix $\bar{\mathbf{G}}_n = \mathbf{R}\mathbf{H}^H\mathbf{T}_n^{-1}\mathbf{H}\mathbf{R}^H$, where
$\mathbf{T}_n = \mathbf{I}_K + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{P}_n\mathbf{P}_n^H\mathbf{H}^H$, can be simplified as

$$\bar{\mathbf{G}}_n = \bar{\mathbf{G}}_{n-1} - \frac{\frac{\rho}{N\sigma^2}\Sigma_1^2\,\mathbf{v}_1\mathbf{v}_1^H}{1 + \frac{\rho}{N\sigma^2}\Sigma_1}, \tag{21}$$

where $\Sigma_1$ and $\mathbf{v}_1$ are the largest singular value and first right
singular vector of $\bar{\mathbf{G}}_{n-1}$, respectively.

Proof: See Appendix B. ■

Proposition 2 implies that we can simply exploit $\Sigma_1$ and
$\mathbf{v}_1$ that have been obtained by Algorithm 1 as described
above to update $\bar{\mathbf{G}}_n$, which only involves one vector-to-vector
multiplication instead of the complicated matrix-to-matrix
multiplication and matrix inversion. Note that the evaluation of
computational complexity will be discussed in detail in Section
III-E.
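
The rank-one update of Proposition 2 can be sketched as below, where `c` stands for $\rho/(N\sigma^2)$. The function name `update_g` and the example matrix are hypothetical; the eigenpair is computed in closed form here only to keep the example self-contained, whereas in the proposed scheme it would come from Algorithm 1.

```python
import math


def update_g(G_prev, sigma1, v1, c):
    """G_n = G_{n-1} - c * sigma1^2 * v1 v1^H / (1 + c * sigma1), following (21)."""
    scale = c * sigma1 ** 2 / (1 + c * sigma1)   # the only division needed
    M = len(v1)
    return [
        [G_prev[i][j] - scale * v1[i] * v1[j].conjugate() for j in range(M)]
        for i in range(M)
    ]


# Illustrative 2 x 2 example with a closed-form dominant eigenpair.
G_prev = [[10.0, 1.0], [1.0, 1.0]]
sigma1 = (11 + math.sqrt(85)) / 2
raw = [1.0, sigma1 - 10.0]                       # satisfies G_prev @ raw = sigma1 * raw
norm = math.sqrt(sum(x * x for x in raw))
v1 = [x / norm for x in raw]
c = 0.5

G_next = update_g(G_prev, sigma1, v1, c)
# After the update, v1 is still an eigenvector, with eigenvalue sigma1 / (1 + c * sigma1).
G_next_v1 = [sum(G_next[i][k] * v1[k] for k in range(2)) for i in range(2)]
deflated = sigma1 / (1 + c * sigma1)
```

The update shrinks the dominant direction of $\bar{\mathbf{G}}_{n-1}$ and leaves the orthogonal directions untouched, using one outer product and one division.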


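D. Summary of the proposed SIC-based hybrid precoding

Based on the discussion so far, the pseudo-code of the
proposed SIC-based hybrid precoding can be summarized in
Algorithm 2, which can be explained as follows. The proposed
SIC-based hybrid precoding starts by computing the largest
singular value $\Sigma_1$ and first right singular vector $\mathbf{v}_1$ of $\bar{\mathbf{G}}_{n-1}$,
which is achieved by Algorithm 1. After that, according to
Section III-B, the optimal precoding vector for the $n$th
sub-antenna array can be obtained by utilizing $\mathbf{v}_1$. Finally, based
on Proposition 2, $\bar{\mathbf{G}}_n$ can be updated with low complexity
for the next iteration. This procedure is executed until
the last ($N$th) sub-antenna array is considered. After
$N$ iterations, the optimal digital, analog, and hybrid precoding
matrices $\mathbf{D}$, $\mathbf{A}$, and $\mathbf{P}$ can be obtained, respectively.

Input: $\bar{\mathbf{G}}_0$
for $1 \le n \le N$
  1) Compute $\mathbf{v}_1$ and $\Sigma_1$ of $\bar{\mathbf{G}}_{n-1}$ by Algorithm 1
  2) $\bar{\mathbf{a}}_n^{\mathrm{opt}} = \frac{1}{\sqrt{M}} e^{j\,\mathrm{angle}(\mathbf{v}_1)}$, $d_n^{\mathrm{opt}} = \frac{\|\mathbf{v}_1\|_1}{\sqrt{M}}$,
     $\bar{\mathbf{p}}_n^{\mathrm{opt}} = \frac{1}{M}\|\mathbf{v}_1\|_1 e^{j\,\mathrm{angle}(\mathbf{v}_1)}$ (14)-(16)
  3) $\bar{\mathbf{G}}_n = \bar{\mathbf{G}}_{n-1} - \dfrac{\frac{\rho}{N\sigma^2}\Sigma_1^2\,\mathbf{v}_1\mathbf{v}_1^H}{1 + \frac{\rho}{N\sigma^2}\Sigma_1}$ (Proposition 2)
end for
Output: (1) $\mathbf{D} = \mathrm{diag}\left\{ d_1^{\mathrm{opt}}, \cdots, d_N^{\mathrm{opt}} \right\}$
        (2) $\mathbf{A} = \mathrm{diag}\left\{ \bar{\mathbf{a}}_1^{\mathrm{opt}}, \cdots, \bar{\mathbf{a}}_N^{\mathrm{opt}} \right\}$
        (3) $\mathbf{P} = \mathbf{A}\mathbf{D}$

Algorithm 2: SIC-based hybrid precoding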

It is worth pointing out that the idea of SIC-based hybrid
precoding can also be extended to the combining at the user
following similar logic. When the number of RF
chains at the BS is smaller than that at the user, we first
compute the optimal hybrid precoding matrix $\mathbf{P}$ according to
Algorithm 2, where we assume that the combining matrix
$\mathbf{Q} = \mathbf{I}$. Then, given the effective channel matrix $\mathbf{H}\mathbf{P}$, we
can similarly obtain the optimal hybrid combining matrix $\mathbf{Q}$
by referring to Algorithm 2, where the input $\bar{\mathbf{G}}_0$ and the
optimal unconstrained solution $\mathbf{v}_1$ should be correspondingly
replaced. Conversely, when the number of RF chains at the
BS is larger than that at the user, we can assume $\mathbf{P} = \mathbf{I}$ and
obtain the optimal hybrid combining matrix $\mathbf{Q}$. After that,
the optimal precoding matrix $\mathbf{P}$ can be acquired given the
effective channel matrix $\mathbf{Q}\mathbf{H}$. Additionally, to further improve
the performance, we can combine the above method with
the “Ping-pong” algorithm, which involves an iterative
procedure between the BS and the user, to jointly search for the
optimal pair of hybrid precoding and combining matrices. Further
discussion about hybrid combining is left for our future
work.

E. Complexity evaluation 

In this subsection, we provide the complexity evaluation
of the proposed SIC-based hybrid precoding in terms of the
required numbers of complex multiplications and divisions.
From Algorithm 2, we can observe that the complexity of
SIC-based hybrid precoding comes from the following four
parts:

1) The first one originates from the computation of
$\bar{\mathbf{G}}_0 = \mathbf{R}\mathbf{H}^H\mathbf{H}\mathbf{R}^H$ according to (11). Note that $\mathbf{R}$ is a
selection matrix and $\mathbf{H}$ has the size $K \times NM$. Therefore, this part
involves $KM^2$ multiplications without any division.

2) The second one is from executing Algorithm 1. It can be
observed that in each iteration we need to compute a
matrix-to-vector multiplication $\mathbf{z}^{(s)} = \bar{\mathbf{G}}_{n-1}\mathbf{u}^{(s-1)}$ together with the
Aitken acceleration method (20). Therefore, we require in total
$S(M^2 + 2) - 4$ multiplications and $(2S - 2)$
divisions.

3) The third one stems from acquiring the optimal solution
$\bar{\mathbf{p}}_n^{\mathrm{opt}}$ in step 2 of Algorithm 2. This part
is quite simple: it only needs 2 multiplications
without any division, since $\mathbf{v}_1$ has been obtained and the
scaling factor in (14)-(16) is a fixed constant.

4) The last one comes from the update of $\bar{\mathbf{G}}_n$. According to
Proposition 2, this part mainly involves an outer
product $\mathbf{v}_1\mathbf{v}_1^H$. Thus, it requires $M^2$ multiplications
with only one division.

To sum up, the proposed SIC-based hybrid precoding
approximately requires $\mathcal{O}\left(M^2(NS + K)\right)$ multiplications
and $\mathcal{O}(2NS)$ divisions. It is worth pointing out that
the recently proposed spatially sparse precoding requires
$\mathcal{O}\left(N^4M + N^2L^2 + N^2M^2L\right)$ multiplications and
$\mathcal{O}\left(2N^3\right)$ divisions, where $L$ is the number of
effective channel paths as defined in (3). Considering the
typical mmWave MIMO system with $N = 8$, $M = 8$, $K = 16$,
$L = 3$, we can observe that the complexity of SIC-based
hybrid precoding is about $4 \times 10^3$ multiplications and
$10^2$ divisions, where we set $S = 5$, which is enough to
guarantee the performance as will be verified in Section IV.
By contrast, the complexity of the spatially sparse precoding
is about $5 \times 10^4$ multiplications and $10^3$
divisions. Therefore, the proposed SIC-based hybrid precoding
enjoys much lower complexity, which is only about 10% of
that of the spatially sparse precoding.

Fig. 3. Achievable rate comparison for an $NM \times K = 64 \times 16$ ($N = 8$)
mmWave MIMO system.
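
The operation counts of Section III-E can be reproduced with a few lines of arithmetic. The sketch below tallies the four parts listed there for the SIC-based scheme and evaluates the order expressions quoted for the spatially sparse precoding; the function names are hypothetical, and the per-part formulas are taken from the text as reconstructed here, so they should be read as approximate.

```python
def sic_complexity(N, M, K, S):
    """Multiplications and divisions of SIC-based hybrid precoding (four parts)."""
    mults = K * M ** 2                  # part 1: G_0 = R H^H H R^H
    divs = 0
    for _ in range(N):                  # parts 2-4 are repeated for each sub-array
        mults += S * (M ** 2 + 2) - 4   # part 2: Algorithm 1 with Aitken step
        divs += 2 * S - 2
        mults += 2                      # part 3: closed-form precoding vector
        mults += M ** 2                 # part 4: outer product v1 v1^H
        divs += 1
    return mults, divs


def sparse_precoding_order(N, M, L):
    """Order terms quoted for spatially sparse precoding (constants omitted)."""
    return N ** 4 * M + N ** 2 * L ** 2 + N ** 2 * M ** 2 * L, 2 * N ** 3


sic_mults, sic_divs = sic_complexity(N=8, M=8, K=16, S=5)
sparse_mults, sparse_divs = sparse_precoding_order(N=8, M=8, L=3)
ratio = sic_mults / sparse_mults        # roughly one tenth
```

For the typical configuration this gives a few thousand multiplications for the SIC-based scheme against a few tens of thousands for the sparse precoding, consistent with the order-of-magnitude figures stated in the text.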


IV. Simulation Results 

In this section, we provide the simulation results of the
achievable rate to evaluate the performance of the proposed
SIC-based hybrid precoding. We compare the performance
of SIC-based hybrid precoding with the recently proposed
spatially sparse precoding and the optimal unconstrained
precoding based on the SVD of the channel matrix, which
are both with fully-connected architecture. Additionally, we
also include the conventional analog precoding and the optimal
unconstrained precoding (i.e., $\bar{\mathbf{p}}_n^{\mathrm{opt}} = \mathbf{v}_1$), which are both with
sub-connected architecture, as benchmarks for
comparison.

The simulation parameters are described as follows.
We generate the channel matrix according to the channel
model described in Section II. The number of
effective channel paths is $L = 3$. The carrier frequency is set as
28 GHz. Both the transmit and receive antenna arrays are
ULAs with antenna spacing $d = \lambda/2$. Since the BS usually
employs directional antennas to eliminate interference and
increase antenna gain, the AoDs are assumed to follow the
uniform distribution within [—f,f] - Meanwhile, due to the 
random position of users, we assume that the AoAs follow the
uniform distribution within $[-\pi, \pi]$, which means that
omni-directional antennas are adopted by users. Furthermore, we set
the maximum number of iterations $S = 5$ to run Algorithm
2. Finally, the signal-to-noise ratio (SNR) is defined as

Fig. 3 shows the achievable rate comparison in a mmWave 
MIMO system, where NM × K = 64 × 16 and the number 
of RF chains is N = 8. We can observe from Fig. 3 that 
the proposed SIC-based hybrid precoding outperforms the 
conventional analog precoding with sub-connected architecture 
over the whole simulated SNR range. Meanwhile, Fig. 3 also verifies 
the near-optimal performance of SIC-based hybrid precoding, 
since it can achieve about 99% of the rate achieved by the 
optimal unconstrained precoding with sub-connected 
architecture. 
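
As a rough illustration of the simulation setup described above (ULAs with half-wavelength spacing, L = 3 paths, uniformly distributed AoDs and AoAs), the following Python sketch draws one channel realization. It is a minimal sketch under stated assumptions, not the authors' simulation code: the geometric multipath form, the complex Gaussian path gains, the sqrt(NM·K/L) normalization, and the names `ula_steering` and `sample_channel` are all introduced here for illustration, and the AoD range is left as a parameter.

```python
import numpy as np

def ula_steering(n_ant, angle, spacing_wavelengths=0.5):
    """Unit-norm array response of a ULA for a given angle in radians."""
    idx = np.arange(n_ant)
    phase = 2j * np.pi * spacing_wavelengths * idx * np.sin(angle)
    return np.exp(phase) / np.sqrt(n_ant)

def sample_channel(n_tx, n_rx, n_paths, aod_half_range, rng):
    """Draw one (n_rx x n_tx) channel from an ASSUMED geometric multipath model."""
    # AoDs uniform in [-aod_half_range, aod_half_range]; AoAs uniform in [-pi, pi].
    aod = rng.uniform(-aod_half_range, aod_half_range, n_paths)
    aoa = rng.uniform(-np.pi, np.pi, n_paths)
    # Assumption: i.i.d. unit-variance complex Gaussian path gains.
    gains = (rng.standard_normal(n_paths)
             + 1j * rng.standard_normal(n_paths)) / np.sqrt(2)
    H = np.zeros((n_rx, n_tx), dtype=complex)
    for g, phi_t, phi_r in zip(gains, aod, aoa):
        # Each path contributes a rank-one term a_r(phi_r) a_t(phi_t)^H.
        H += g * np.outer(ula_steering(n_rx, phi_r),
                          ula_steering(n_tx, phi_t).conj())
    # Assumed normalization factor sqrt(n_tx * n_rx / n_paths).
    return np.sqrt(n_tx * n_rx / n_paths) * H
```

For instance, `sample_channel(64, 16, 3, aod_half_range, np.random.default_rng(0))` would correspond to the NM × K = 64 × 16 configuration with L = 3 paths; the resulting matrix has rank at most L because it is a sum of L rank-one terms.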

Fig. 4. Achievable rate comparison for an NM × K = 128 × 32 (N = 16) 
mmWave MIMO system. 

Fig. 4 compares the achievable rate in a mmWave MIMO 
system with NM × K = 128 × 32 and N = 16, where we 
can observe similar trends to those in Fig. 3. More importantly, 
Fig. 3 and Fig. 4 show that the performance of SIC- 
based hybrid precoding is also close to the spatially sparse 
precoding and the optimal unconstrained precoding with fully- 
connected architecture. For example, when SNR = 0 dB, our 
method can achieve more than 90% of the rate achieved by 
the near-optimal spatially sparse precoding in both simulated 
mmWave MIMO configurations. Considering the low energy 
consumption and computational complexity of the proposed 
SIC-based hybrid precoding as analyzed before, we can further 
conclude that SIC-based hybrid precoding can achieve a much 
better trade-off among the performance, energy consumption, 
and computational complexity. 

Fig. 5 provides an achievable rate comparison in mmWave 
MIMO systems against the numbers of BS and user antennas, 
where NM = K and the number of RF chains is fixed to 
N = 8. We can find that the performance of the proposed 
SIC-based hybrid precoding can be improved by increasing the 
number of BS and user antennas, which involves much lower 
energy consumption than increasing the number of energy- 
intensive RF chains. 

Fig. 6 shows the achievable rate comparison against the 
number of user antennas K, where NM = 64 and N = 8. 
We can infer from Fig. 6 that the performance loss of SIC- 
based hybrid precoding due to the sub-connected architecture 
can be compensated by increasing the number of user antennas 
K. For example, the achievable rate of SIC-based hybrid 
precoding when K = 30 is the same as that of the spatially 
sparse precoding when K = 20. Note that in this case, 
the required number of phase shifters of SIC-based hybrid 
precoding is NM = 64 and each RF chain only needs to 
drive 8 BS antennas, while for the spatially sparse precoding, 
the number of required phase shifters is N²M = 512 and 
each RF chain has to drive 64 BS antennas. By contrast, 
the cost of increasing the number of user antennas K will 
be negligible since the power consumption of user antenna 
is usually small. Therefore, we can conclude that the 
proposed SIC-based hybrid precoding is more energy-efficient 
and implementation-practical. 

Fig. 5. Achievable rate comparison against the numbers of BS and user 
antennas (NM = K), where N = 8. 

Fig. 6. Achievable rate comparison against the number of user antennas K, 
where NM = 64 and N = 8. 


V. Conclusions 

In this paper, we proposed a SIC-based hybrid precoding 
with sub-connected architecture for mmWave MIMO systems. 
We first showed that the total achievable rate optimization 
problem with non-convex constraints can be decomposed into 
a series of sub-rate optimization problems, each of which 
only considers one sub-antenna array. Then, we proved that 
the sub-rate optimization problem of each sub-antenna array 
can be solved by simply seeking a precoding vector sufficiently 
close to the unconstrained optimal solution. Finally, 
a low-complexity algorithm was proposed to realize SIC- 
based precoding without the complicated SVD and matrix 
inversion. Complexity evaluation showed that the complexity 
of the proposed SIC-based hybrid precoding is only about 
10% of that of the recently proposed spatially 
sparse precoding with fully-connected architecture in a typical 
mmWave MIMO system. Simulation results verified the near- 
optimal performance of SIC-based hybrid precoding, and 
implied that the performance loss induced by the sub-connected 
architecture can be compensated by increasing the number of 
antennas. This may be a reasonable tradeoff versus increasing 
the number of phase shifters as required in the fully-connected 
architecture. Our further work will focus on the limited feedback 
scenario, where the channel state information may not be 
perfect and the angles of phase shifters are quantized. 


Appendix A 

Proof of Proposition 1 

Define the target of the optimization problem (10) as 

$$R_n = \log_2\left(1 + \frac{\rho}{N\sigma^2}\,\bar{\mathbf{p}}_n^H \bar{\mathbf{G}}_{n-1} \bar{\mathbf{p}}_n\right), \tag{22}$$

and the SVD of $\bar{\mathbf{G}}_{n-1}$ as $\bar{\mathbf{G}}_{n-1} = \mathbf{V}\boldsymbol{\Sigma}\mathbf{V}^H$. Then, by separating 
the matrices $\boldsymbol{\Sigma}$ and $\mathbf{V}$ into two parts: 

$$\boldsymbol{\Sigma} = \begin{bmatrix} \Sigma_1 & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_2 \end{bmatrix}, \quad \mathbf{V} = [\mathbf{v}_1 \ \ \mathbf{V}_2], \tag{23}$$

$R_n$ in (22) can be rewritten as 

$$\begin{aligned}
R_n &= \log_2\left(1 + \frac{\rho}{N\sigma^2}\,\bar{\mathbf{p}}_n^H \bar{\mathbf{G}}_{n-1} \bar{\mathbf{p}}_n\right) \\
&= \log_2\left(1 + \frac{\rho}{N\sigma^2}\,\bar{\mathbf{p}}_n^H \mathbf{V}\boldsymbol{\Sigma}\mathbf{V}^H \bar{\mathbf{p}}_n\right) \\
&= \log_2\left(1 + \frac{\rho}{N\sigma^2}\,\bar{\mathbf{p}}_n^H [\mathbf{v}_1 \ \ \mathbf{V}_2] \begin{bmatrix} \Sigma_1 & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_2 \end{bmatrix} [\mathbf{v}_1 \ \ \mathbf{V}_2]^H \bar{\mathbf{p}}_n\right) \\
&= \log_2\left(1 + \frac{\rho}{N\sigma^2}\,\bar{\mathbf{p}}_n^H \mathbf{v}_1 \Sigma_1 \mathbf{v}_1^H \bar{\mathbf{p}}_n + \frac{\rho}{N\sigma^2}\,\bar{\mathbf{p}}_n^H \mathbf{V}_2 \boldsymbol{\Sigma}_2 \mathbf{V}_2^H \bar{\mathbf{p}}_n\right).
\end{aligned} \tag{24}$$

Since we aim to find a vector $\bar{\mathbf{p}}_n$ sufficiently "close" to $\mathbf{v}_1$, 
it is reasonable to assume that $\bar{\mathbf{p}}_n$ is approximately orthogonal 
to the matrix $\mathbf{V}_2$, i.e., $\bar{\mathbf{p}}_n^H \mathbf{V}_2 \approx \mathbf{0}$. Then, (24) can be 
simplified as 



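$$\begin{aligned}
R_n &\approx \log_2\left(1 + \frac{\rho}{N\sigma^2}\,\bar{\mathbf{p}}_n^H \mathbf{v}_1 \Sigma_1 \mathbf{v}_1^H \bar{\mathbf{p}}_n\right) \\
&\overset{(a)}{=} \log_2\left(1 + \frac{\rho\Sigma_1}{N\sigma^2}\right) + \log_2\left(1 - \left(1 + \frac{\rho\Sigma_1}{N\sigma^2}\right)^{-1} \frac{\rho\Sigma_1}{N\sigma^2}\left(1 - \bar{\mathbf{p}}_n^H \mathbf{v}_1 \mathbf{v}_1^H \bar{\mathbf{p}}_n\right)\right) \\
&\overset{(b)}{\approx} \log_2\left(1 + \frac{\rho\Sigma_1}{N\sigma^2}\right) + \log_2\left(\bar{\mathbf{p}}_n^H \mathbf{v}_1 \mathbf{v}_1^H \bar{\mathbf{p}}_n\right),
\end{aligned} \tag{25}$$
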


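where (a) is obtained by using the formula 
$\mathbf{I} + \mathbf{X}\mathbf{Y} = (\mathbf{I} + \mathbf{X})\left(\mathbf{I} - (\mathbf{I} + \mathbf{X})^{-1}\mathbf{X}(\mathbf{I} - \mathbf{Y})\right)$, where 
we define $\mathbf{X} = \frac{\rho\Sigma_1}{N\sigma^2}$ and $\mathbf{Y} = \bar{\mathbf{p}}_n^H \mathbf{v}_1 \mathbf{v}_1^H \bar{\mathbf{p}}_n$; (b) is valid by 
employing the high SNR approximation, i.e., 

$$\left(1 + \frac{\rho\Sigma_1}{N\sigma^2}\right)^{-1} \frac{\rho\Sigma_1}{N\sigma^2} \approx 1. \tag{26}$$
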


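From (25), we can observe that maximizing $R_n$ is equivalent 
to maximizing $\bar{\mathbf{p}}_n^H \mathbf{v}_1 \mathbf{v}_1^H \bar{\mathbf{p}}_n = \|\bar{\mathbf{p}}_n^H \mathbf{v}_1\|_2^2$, the square of inner 
product between two vectors $\bar{\mathbf{p}}_n$ and $\mathbf{v}_1$. Note that $\mathbf{v}_1$ is a 
fixed vector. Therefore, exploring a vector $\bar{\mathbf{p}}_n$, which has the 
largest projection on $\mathbf{v}_1$, will lead to the smallest Euclidean 
distance to $\mathbf{v}_1$ as well. Based on this fact, we can conclude that 
the optimization problem (10) is equivalent to the following 
problem 

$$\bar{\mathbf{p}}_n^{\rm opt} = \arg\min_{\bar{\mathbf{p}}_n} \|\mathbf{v}_1 - \bar{\mathbf{p}}_n\|_2^2. \tag{27}$$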
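
One way to see the link between the inner product and the Euclidean distance used in this proof is to expand the squared distance. The short derivation below is a sketch under two assumptions that are not restated in this appendix: $\mathbf{v}_1$ has unit norm (it is a singular vector), and the norm of the candidate vector $\bar{\mathbf{p}}_n$ is held fixed.

```latex
\begin{aligned}
\|\mathbf{v}_1 - \bar{\mathbf{p}}_n\|_2^2
  &= \|\mathbf{v}_1\|_2^2 + \|\bar{\mathbf{p}}_n\|_2^2
     - 2\,\mathrm{Re}\{\mathbf{v}_1^H \bar{\mathbf{p}}_n\}, \\
\left(\mathrm{Re}\{\mathbf{v}_1^H \bar{\mathbf{p}}_n\}\right)^2
  &\le |\mathbf{v}_1^H \bar{\mathbf{p}}_n|^2
   = \bar{\mathbf{p}}_n^H \mathbf{v}_1 \mathbf{v}_1^H \bar{\mathbf{p}}_n .
\end{aligned}
```

With both norms fixed, minimizing the distance is the same as maximizing $\mathrm{Re}\{\mathbf{v}_1^H \bar{\mathbf{p}}_n\}$, and the inequality holds with equality when the inner product is real, so a vector that is close to $\mathbf{v}_1$ in Euclidean distance also has a large squared inner product with it.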


Appendix B 

Proof of Proposition 2 

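We first consider the matrix $\mathbf{T}_n = \mathbf{I}_K + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{P}_n\mathbf{P}_n^H\mathbf{H}^H$, 
which should be inverted to compute $\bar{\mathbf{G}}_n$ in (11). By partitioning 
$\mathbf{P}_n$ as $\mathbf{P}_n = [\mathbf{P}_{n-1} \ \ \mathbf{p}_n]$, $\mathbf{T}_n$ can be rewritten as 

$$\begin{aligned}
\mathbf{T}_n &= \mathbf{I}_K + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{P}_n\mathbf{P}_n^H\mathbf{H}^H \\
&= \mathbf{I}_K + \frac{\rho}{N\sigma^2}\mathbf{H}[\mathbf{P}_{n-1} \ \ \mathbf{p}_n][\mathbf{P}_{n-1} \ \ \mathbf{p}_n]^H\mathbf{H}^H \\
&= \mathbf{I}_K + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{P}_{n-1}\mathbf{P}_{n-1}^H\mathbf{H}^H + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{p}_n\mathbf{p}_n^H\mathbf{H}^H \\
&= \mathbf{T}_{n-1} + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{p}_n\mathbf{p}_n^H\mathbf{H}^H.
\end{aligned} \tag{28}$$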
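
The recursion in (28) can be checked numerically together with the decomposition of the total rate into sub-rates. The sketch below is only an illustration under stated assumptions: the total rate is taken to be $\log_2\det(\mathbf{T}_N)$, the scalar `c` stands in for $\rho/(N\sigma^2)$, $\mathbf{G}_{n-1} = \mathbf{H}^H\mathbf{T}_{n-1}^{-1}\mathbf{H}$ as used later in this appendix, and the matrix sizes and random entries are toy values rather than the paper's configuration. The equality it tests follows from the matrix determinant lemma.

```python
import numpy as np

rng = np.random.default_rng(0)
K, NM, N = 4, 6, 3   # toy sizes (hypothetical), not the simulated system
c = 0.7              # stands in for rho / (N * sigma^2)

H = rng.standard_normal((K, NM)) + 1j * rng.standard_normal((K, NM))
P = rng.standard_normal((NM, N)) + 1j * rng.standard_normal((NM, N))

# Direct evaluation: log2 det(T_N) with T_N = I_K + c H P P^H H^H.
T_full = np.eye(K) + c * H @ P @ P.conj().T @ H.conj().T
total_rate = np.log2(np.linalg.det(T_full).real)

# Successive evaluation: T_n = T_{n-1} + c H p_n p_n^H H^H, as in (28).
T = np.eye(K, dtype=complex)
sub_rates = []
for n in range(N):
    p = P[:, [n]]                                   # n-th precoding vector
    G_prev = H.conj().T @ np.linalg.solve(T, H)     # G_{n-1} = H^H T_{n-1}^{-1} H
    sub_rates.append(np.log2(1 + c * (p.conj().T @ G_prev @ p).real.item()))
    T = T + c * H @ p @ p.conj().T @ H.conj().T     # rank-one update of T

print(total_rate, sum(sub_rates))
```

The two printed values should agree up to floating-point error, which is the sense in which the total rate splits into one sub-rate per precoding vector.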


Then, by utilizing the Sherman-Morrison formula im Eq 
2.1.4] 

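$$(\mathbf{A} + \mathbf{u}\mathbf{v}^T)^{-1} = \mathbf{A}^{-1} - \frac{\mathbf{A}^{-1}\mathbf{u}\mathbf{v}^T\mathbf{A}^{-1}}{1 + \mathbf{v}^T\mathbf{A}^{-1}\mathbf{u}}, \tag{29}$$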


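$\mathbf{T}_n^{-1}$ can be presented as 

$$\begin{aligned}
\mathbf{T}_n^{-1} &= \left(\mathbf{T}_{n-1} + \frac{\rho}{N\sigma^2}\mathbf{H}\mathbf{p}_n\mathbf{p}_n^H\mathbf{H}^H\right)^{-1} \\
&= \mathbf{T}_{n-1}^{-1} - \frac{\frac{\rho}{N\sigma^2}\mathbf{T}_{n-1}^{-1}\mathbf{H}\mathbf{p}_n\mathbf{p}_n^H\mathbf{H}^H\mathbf{T}_{n-1}^{-1}}{1 + \frac{\rho}{N\sigma^2}\mathbf{p}_n^H\mathbf{H}^H\mathbf{T}_{n-1}^{-1}\mathbf{H}\mathbf{p}_n}.
\end{aligned} \tag{30}$$

Substituting (30) into $\mathbf{G}_n = \mathbf{H}^H\mathbf{T}_n^{-1}\mathbf{H}$, we have 

$$\begin{aligned}
\mathbf{G}_n &= \mathbf{H}^H\mathbf{T}_{n-1}^{-1}\mathbf{H} - \frac{\frac{\rho}{N\sigma^2}\mathbf{H}^H\mathbf{T}_{n-1}^{-1}\mathbf{H}\mathbf{p}_n\mathbf{p}_n^H\mathbf{H}^H\mathbf{T}_{n-1}^{-1}\mathbf{H}}{1 + \frac{\rho}{N\sigma^2}\mathbf{p}_n^H\mathbf{H}^H\mathbf{T}_{n-1}^{-1}\mathbf{H}\mathbf{p}_n} \\
&= \mathbf{G}_{n-1} - \frac{\frac{\rho}{N\sigma^2}\mathbf{G}_{n-1}\mathbf{p}_n\mathbf{p}_n^H\mathbf{G}_{n-1}}{1 + \frac{\rho}{N\sigma^2}\mathbf{p}_n^H\mathbf{G}_{n-1}\mathbf{p}_n}.
\end{aligned} \tag{31}$$

Then, according to (11), $\bar{\mathbf{G}}_n$ can be obtained by 

$$\begin{aligned}
\bar{\mathbf{G}}_n &= \mathbf{R}\mathbf{G}_n\mathbf{R}^H \\
&= \mathbf{R}\left(\mathbf{G}_{n-1} - \frac{\frac{\rho}{N\sigma^2}\mathbf{G}_{n-1}\mathbf{p}_n\mathbf{p}_n^H\mathbf{G}_{n-1}}{1 + \frac{\rho}{N\sigma^2}\mathbf{p}_n^H\mathbf{G}_{n-1}\mathbf{p}_n}\right)\mathbf{R}^H \\
&= \bar{\mathbf{G}}_{n-1} - \frac{\frac{\rho}{N\sigma^2}\bar{\mathbf{G}}_{n-1}\bar{\mathbf{p}}_n\bar{\mathbf{p}}_n^H\bar{\mathbf{G}}_{n-1}}{1 + \frac{\rho}{N\sigma^2}\bar{\mathbf{p}}_n^H\bar{\mathbf{G}}_{n-1}\bar{\mathbf{p}}_n}.
\end{aligned} \tag{32}$$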
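
The rank-one updates in (30) and (31) can be sanity-checked against direct matrix inversion. The following is a minimal numerical sketch, not part of the original derivation: the sizes, the random matrices, and the scalar `c` (standing in for $\rho/(N\sigma^2)$) are arbitrary toy choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
K, NM = 4, 6   # toy sizes (hypothetical)
c = 0.5        # stands in for rho / (N * sigma^2)

H = rng.standard_normal((K, NM)) + 1j * rng.standard_normal((K, NM))
P_prev = rng.standard_normal((NM, 2)) + 1j * rng.standard_normal((NM, 2))
p = rng.standard_normal((NM, 1)) + 1j * rng.standard_normal((NM, 1))

# T_{n-1} and G_{n-1} = H^H T_{n-1}^{-1} H, computed directly.
T_prev = np.eye(K) + c * H @ P_prev @ P_prev.conj().T @ H.conj().T
T_prev_inv = np.linalg.inv(T_prev)
G_prev = H.conj().T @ T_prev_inv @ H

# (30): Sherman-Morrison update of the inverse after adding c (Hp)(Hp)^H.
u = H @ p
denom_T = 1 + c * (u.conj().T @ T_prev_inv @ u).item()
T_inv_sm = T_prev_inv - c * (T_prev_inv @ u @ u.conj().T @ T_prev_inv) / denom_T
T_inv_direct = np.linalg.inv(T_prev + c * u @ u.conj().T)

# (31): the same update expressed in terms of G_{n-1}, with no new inversion.
denom_G = 1 + c * (p.conj().T @ G_prev @ p).item()
G_sm = G_prev - c * (G_prev @ p @ p.conj().T @ G_prev) / denom_G
G_direct = H.conj().T @ T_inv_direct @ H

print(np.allclose(T_inv_sm, T_inv_direct), np.allclose(G_sm, G_direct))
```

The point of the update form is that, once $\mathbf{G}_{n-1}$ is available, $\mathbf{G}_n$ follows from matrix-vector products and one scalar division instead of a fresh matrix inversion.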


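Note that in Section III-B, we have obtained the optimal 
solution $\bar{\mathbf{p}}_n^{\rm opt}$ which is sufficiently close to $\mathbf{v}_1$. Thus, (32) can 
be well approximated by replacing $\bar{\mathbf{p}}_n$ with $\mathbf{v}_1$ as 

$$\begin{aligned}
\bar{\mathbf{G}}_n &= \bar{\mathbf{G}}_{n-1} - \frac{\frac{\rho}{N\sigma^2}\bar{\mathbf{G}}_{n-1}\bar{\mathbf{p}}_n\bar{\mathbf{p}}_n^H\bar{\mathbf{G}}_{n-1}}{1 + \frac{\rho}{N\sigma^2}\bar{\mathbf{p}}_n^H\bar{\mathbf{G}}_{n-1}\bar{\mathbf{p}}_n} \\
&\approx \bar{\mathbf{G}}_{n-1} - \frac{\frac{\rho}{N\sigma^2}\bar{\mathbf{G}}_{n-1}\mathbf{v}_1\mathbf{v}_1^H\bar{\mathbf{G}}_{n-1}}{1 + \frac{\rho}{N\sigma^2}\mathbf{v}_1^H\bar{\mathbf{G}}_{n-1}\mathbf{v}_1} \\
&\overset{(a)}{=} \bar{\mathbf{G}}_{n-1} - \frac{\frac{\rho}{N\sigma^2}\Sigma_1^2\mathbf{v}_1\mathbf{v}_1^H}{1 + \frac{\rho}{N\sigma^2}\Sigma_1},
\end{aligned} \tag{33}$$

where (a) is true due to the fact that $\mathbf{v}_1^H\bar{\mathbf{G}}_{n-1} = \Sigma_1\mathbf{v}_1^H$, since 
$\bar{\mathbf{G}}_{n-1}$ is a Hermitian matrix. ■ 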
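
Step (a) in (33) can be verified numerically for the case where the precoding vector equals $\mathbf{v}_1$ exactly. The sketch below makes several assumptions for illustration only: a random Hermitian positive semidefinite matrix stands in for $\bar{\mathbf{G}}_{n-1}$, the scalar `c` stands in for $\rho/(N\sigma^2)$, and the size `M` is a toy value. For such a matrix the eigendecomposition returned by `numpy.linalg.eigh` coincides with the SVD $\mathbf{V}\boldsymbol{\Sigma}\mathbf{V}^H$.

```python
import numpy as np

rng = np.random.default_rng(2)
M = 5      # toy sub-array size (hypothetical)
c = 0.8    # stands in for rho / (N * sigma^2)

# Hermitian PSD stand-in for \bar{G}_{n-1}.
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
G_prev = A @ A.conj().T

# eigh returns eigenvalues in ascending order; the last pair is (Sigma_1, v_1).
eigvals, eigvecs = np.linalg.eigh(G_prev)
sigma1 = eigvals[-1]
v1 = eigvecs[:, [-1]]

# Second line of (33): update (32) evaluated at p = v_1.
quad = (v1.conj().T @ G_prev @ v1).real.item()
G_exact = G_prev - c * (G_prev @ v1 @ v1.conj().T @ G_prev) / (1 + c * quad)

# Third line of (33): closed form using v_1^H G = Sigma_1 v_1^H.
G_closed = G_prev - c * sigma1**2 * (v1 @ v1.conj().T) / (1 + c * sigma1)

# Gain left along v_1 after the update.
new_gain = (v1.conj().T @ G_closed @ v1).real.item()
print(np.allclose(G_exact, G_closed), new_gain, sigma1 / (1 + c * sigma1))
```

The update only touches the direction $\mathbf{v}_1$: the gain along it shrinks from $\Sigma_1$ to $\Sigma_1/(1 + c\,\Sigma_1)$, while the remaining singular directions are left unchanged.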